On the selection of thresholds for predicting species occurrence with presence‐only data

Authors

  • Canran Liu
  • Graeme Newell
  • Matt White
Abstract

Presence-only data present challenges for selecting thresholds to transform species distribution modeling results into binary outputs. In this article, we compare two recently published threshold selection methods (maxSSS and maxFpb) and examine the effectiveness of the threshold-based prevalence estimation approach. Six virtual species with varying prevalence were simulated within a real landscape in southeastern Australia. Presence-only models were built with DOMAIN, generalized linear models, Maxent, and Random Forest. Thresholds were selected with the two methods, maxSSS and maxFpb, using four presence-only datasets with different ratios of the number of known presences to the number of random points (KP-RP ratio). Sensitivity, specificity, true skill statistic, and F measure were used to evaluate the performance of the results. Species prevalence was estimated as the ratio of the number of predicted presences to the total number of points in the evaluation dataset. Thresholds selected with maxFpb varied as the KP-RP ratio of the threshold selection datasets changed. Datasets with a KP-RP ratio around 1 generally produced better results than datasets with ratios distant from 1. Results produced by maxFpb had specificity too low for very common species using Random Forest and Maxent models. In contrast, maxSSS produced consistent results whichever dataset was used. The estimation of prevalence was almost always biased, and the bias was very large for DOMAIN and Random Forest predictions. We conclude that maxFpb is affected by the KP-RP ratio of the threshold selection datasets, but maxSSS is almost unaffected by this ratio. Unbiased estimates of prevalence are difficult to obtain using the threshold-based approach.
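The threshold selection and prevalence estimation steps described in the abstract can be illustrated with a minimal sketch. It assumes that maxSSS picks the threshold maximizing the sum of sensitivity (computed on known presences) and specificity (approximated on random points treated as pseudo-absences), and that prevalence is the fraction of evaluation points predicted present. The function names and the toy scores are hypothetical and are not taken from the article; maxFpb is not shown.

```python
def max_sss_threshold(presence_scores, random_scores):
    """Return (threshold, sensitivity + specificity) maximizing the sum.

    Assumption: random points stand in for absences, so the
    "specificity" here is only an approximation of the true value.
    """
    candidates = sorted(set(presence_scores) | set(random_scores))
    best_threshold, best_sum = None, -1.0
    for t in candidates:
        # A point is predicted present when its score is >= t.
        sens = sum(s >= t for s in presence_scores) / len(presence_scores)
        spec = sum(s < t for s in random_scores) / len(random_scores)
        # Strict comparison keeps the lowest threshold among ties.
        if sens + spec > best_sum:
            best_threshold, best_sum = t, sens + spec
    return best_threshold, best_sum


def estimated_prevalence(eval_scores, threshold):
    """Fraction of evaluation points predicted present at the threshold."""
    return sum(s >= threshold for s in eval_scores) / len(eval_scores)


# Toy model outputs (hypothetical values, for illustration only).
presence = [0.6, 0.7, 0.8, 0.9]
random_points = [0.1, 0.2, 0.3, 0.4, 0.5, 0.75]

threshold, sss = max_sss_threshold(presence, random_points)
tss = sss - 1.0  # true skill statistic = sensitivity + specificity - 1
prevalence = estimated_prevalence([0.1, 0.3, 0.65, 0.9, 0.2], threshold)
```

Because sensitivity depends only on the presences and the approximate specificity only on the random points, neither rate changes when the number of random points is scaled up or down, which is consistent with the abstract's finding that maxSSS is almost unaffected by the KP-RP ratio.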

Similar resources

Modeling the distribution of Isfahan wild sheep in the Tang-e Sayyad Protected Area based on correcting presence-data bias and selecting appropriate variables using maximum entropy

This study employs the maximum entropy modelling technique to investigate the geographic distribution pattern of wild sheep (Ovis orientalis) in the Tangeh Sayyad Protected Area. A set of eight environmental predictors is employed together with presence-only records of wild sheep. Two methods have been used to improve modeling performance: density-based occurrence thinning and performance-base...

Determining the optimal presence threshold in predictive models of plant species distribution (Case study: rangelands of the Nir region, Yazd province)

The current study addresses the determination of optimal occurrence thresholds for predictive models of plant species distribution in the Nir rangelands of Yazd province. Accordingly, after homogeneous units were determined using a digital elevation model and geology maps at a scale of 1:25000, vegetation sampling was carried out using a random-systematic method via plots established across 3-5 transe...

Application of genetic algorithm (GA) to select input variables in support vector machine (SVM) for analyzing the occurrence of roach, Rutilus rutilus, in streams

A support vector machine (SVM) was used to analyze the occurrence of roach in Flemish stream basins (Belgium). Several habitat and physico-chemical variables were used as inputs for model development. The biotic variable consisted solely of abundance data, which were used to predict the presence/absence of roach. A genetic algorithm (GA) was combined with the SVM in order to select the most important...

Predicting the distribution of plant species using logistic regression (Case study: Garizat rangelands of Yazd province)

The aim of this research was to study the relationships between the presence of plant species and environmental factors in the Garizat rangelands of Yazd province and to provide predictive habitat models for these species. After delimitation of the study area, sampling was performed using a randomized-systematic method. Accordingly, vegetation data including presence and cover percentage were determined in each quadr...

Comparison of Indicators for Determining the Thresholds of Banks' Financial Crisis in EWS Based on Business Cycles

The purpose of this paper is to design a prediction system for bank bankruptcy thresholds based on the business cycle and to examine the effects of different approaches to defining the bankruptcy threshold on predicting the bankruptcy time of Iranian banks using the Kaplan-Meier and Cox proportional-hazards models. To this end, data on banks listed on the Tehran Stock Exchange were used from 1385-1...

Volume 6

Publication date: 2016